See the accompanying PowerPoint slides for explanations.

1. Load data

Check and remove duplicate rows
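A minimal sketch of this step, assuming the data is held in a pandas DataFrame. The toy frame and its column names are placeholders, not the actual dataset.

```python
import pandas as pd

# Toy frame standing in for the loaded dataset (hypothetical columns and values).
df = pd.DataFrame({
    "age": [39, 50, 39, 28],
    "education": ["Bachelors", "HS-grad", "Bachelors", "Masters"],
    "income": ["<=50K", "<=50K", "<=50K", ">50K"],
})

# duplicated() flags every row that is identical to an earlier row.
n_duplicates = int(df.duplicated().sum())

# Keep the first occurrence of each row and rebuild a clean index.
df_clean = df.drop_duplicates().reset_index(drop=True)

print(f"{n_duplicates} duplicate row(s) removed, {len(df_clean)} rows remain")
```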

Check missing values
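One way to run this check, sketched on a small placeholder DataFrame (the column names are assumptions). Note that `isna()` only catches true nulls; placeholder strings used to mark missing entries would need a separate check.

```python
import pandas as pd

# Toy frame standing in for the loaded dataset (hypothetical columns and values).
df = pd.DataFrame({
    "age": [39, 50, 28],
    "workclass": ["Private", "State-gov", "Private"],
    "hours_per_week": [40, 13, 40],
})

# Count null entries per column, then in total.
missing_per_column = df.isna().sum()
total_missing = int(missing_per_column.sum())

print(missing_per_column)
print("Total missing values:", total_missing)
```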

No missing values.

2. Exploratory Data Analysis

2.1 Numerical features - univariate analysis

2.2 Numerical features - bivariate analysis with income
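A bivariate comparison against the target can be sketched as a grouped summary. The column names and values below are illustrative assumptions, not figures from the notebook.

```python
import pandas as pd

# Toy data: two numerical features and the income label (hypothetical values).
df = pd.DataFrame({
    "age": [25, 45, 38, 52, 29, 41],
    "hours_per_week": [40, 50, 40, 60, 35, 45],
    "income": ["<=50K", ">50K", "<=50K", ">50K", "<=50K", ">50K"],
})

# Mean of each numerical feature within each income class.
summary = df.groupby("income")[["age", "hours_per_week"]].mean()
print(summary)
```

The same grouped frame can feed a box plot or histogram per class when a visual comparison is preferred.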

2.3 Numerical features - multivariate analysis

2.4 Categorical features - univariate analysis

2.5 Categorical features - bivariate analysis with income

Plot education in ascending order of education level for a clearer display
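A possible way to get that ordering is an ordered categorical, sketched below. The list of levels is a hypothetical subset chosen for illustration; the real column may contain more categories.

```python
import pandas as pd

# Hypothetical subset of education levels, listed from lowest to highest.
education_order = ["HS-grad", "Some-college", "Bachelors", "Masters", "Doctorate"]

df = pd.DataFrame({
    "education": ["Masters", "HS-grad", "Bachelors", "HS-grad",
                  "Doctorate", "Bachelors", "HS-grad", "Some-college"],
})

# An ordered categorical makes sorting follow education level, not the alphabet.
df["education"] = pd.Categorical(
    df["education"], categories=education_order, ordered=True
)

# Counts sorted by category order, ready to pass to a bar plot.
counts = df["education"].value_counts().sort_index()
print(counts)
```

Calling `counts.plot(kind="bar")` on the result would then draw the bars from lowest to highest level.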

2.6 Other bivariate analysis

3. Data pre-processing and feature engineering

3.1 Drop features

3.2 Treat outliers
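The outline does not say which outlier treatment was used. As one common option, the sketch below caps values outside the 1.5 × IQR fences; the helper name `cap_outliers_iqr` and the sample values are assumptions made for illustration.

```python
import pandas as pd

def cap_outliers_iqr(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Clip values to [Q1 - k*IQR, Q3 + k*IQR] (hypothetical helper)."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    return values.clip(lower=q1 - k * iqr, upper=q3 + k * iqr)

# Toy feature with one very high and one very low value.
hours = pd.Series([40, 38, 45, 40, 42, 99, 1, 40])
capped = cap_outliers_iqr(hours)

print(capped.tolist())
```

Capping keeps every row, which matters when the dataset is imbalanced and minority-class rows are scarce; dropping the rows is the usual alternative.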

3.3 Standardization
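A minimal sketch of standardization with scikit-learn's `StandardScaler`, assuming a train/test split already exists. The arrays are toy values; the point is that the scaler is fitted on the training data only and then reused on the test data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy numerical features (hypothetical): columns could be age and hours per week.
X_train = np.array([[25.0, 40.0], [35.0, 50.0], [45.0, 60.0]])
X_test = np.array([[35.0, 45.0]])

scaler = StandardScaler()

# Learn mean and standard deviation from the training set only.
X_train_scaled = scaler.fit_transform(X_train)

# Apply the same statistics to the test set to avoid leakage.
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled)
print(X_test_scaled)
```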

3.4 One-hot encoding
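One-hot encoding can be sketched with `pandas.get_dummies`. The columns below are placeholders, and `drop_first=True` is an assumption that avoids redundant columns for linear models; the notebook may have kept all levels.

```python
import pandas as pd

# Toy frame with one numerical and two categorical columns (hypothetical).
df = pd.DataFrame({
    "age": [39, 50, 28],
    "sex": ["Male", "Female", "Male"],
    "workclass": ["Private", "State-gov", "Private"],
})

# Each categorical column becomes one indicator column per level,
# minus the first level when drop_first=True.
encoded = pd.get_dummies(df, columns=["sex", "workclass"], drop_first=True)

print(encoded.columns.tolist())
```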

Observation:

3.5 Handling imbalanced dataset
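The outline does not name the balancing technique. The sketch below shows random oversampling of the minority class with `sklearn.utils.resample`, as one simple option among several (class weights and synthetic oversampling are others). The data and the 0/1 label encoding are assumptions.

```python
import pandas as pd
from sklearn.utils import resample

# Toy training set: 6 majority rows (income = 0) and 2 minority rows (income = 1).
train = pd.DataFrame({
    "age": [25, 31, 38, 29, 44, 36, 52, 47],
    "income": [0, 0, 0, 0, 0, 0, 1, 1],
})

majority = train[train["income"] == 0]
minority = train[train["income"] == 1]

# Sample minority rows with replacement until both classes have equal size.
minority_upsampled = resample(
    minority, replace=True, n_samples=len(majority), random_state=42
)

# Recombine and shuffle so the classes are interleaved.
balanced = (
    pd.concat([majority, minority_upsampled])
    .sample(frac=1, random_state=42)
    .reset_index(drop=True)
)

print(balanced["income"].value_counts())
```

Resampling should be applied to the training split only, so that evaluation still reflects the original class distribution.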

4. Model Building

4.1 Logistic Regression
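A minimal fit-and-evaluate sketch for this model, using synthetic data from `make_classification` in place of the preprocessed features. The class weights, split size, and the choice of F1 as the metric are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic, mildly imbalanced binary data standing in for the real features.
X, y = make_classification(
    n_samples=500, n_features=8, weights=[0.75, 0.25], random_state=0
)

# Stratify so both splits keep the same class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
test_f1 = f1_score(y_test, y_pred)
print(f"Test F1: {test_f1:.3f}")
```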

4.2 Decision Tree

4.3 Random Forest

4.4 XGBoost

With all default parameters of XGBClassifier()

With hyperparameter tuning
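The search procedure and parameter grid are not given in the outline. The sketch below shows a small cross-validated grid search and, to stay within scikit-learn, swaps in `GradientBoostingClassifier` as a stand-in estimator. `XGBClassifier` exposes a scikit-learn compatible interface, so it should slot into the same `GridSearchCV` call. The grid values and scoring metric are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic data standing in for the preprocessed training set.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Deliberately tiny, hypothetical grid so the search runs quickly.
param_grid = {"n_estimators": [25, 50], "max_depth": [2, 3]}

search = GridSearchCV(
    GradientBoostingClassifier(random_state=0),  # stand-in for XGBClassifier
    param_grid,
    scoring="f1",
    cv=3,
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print(f"Best cross-validated F1: {search.best_score_:.3f}")
```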

4.5 LightGBM

With all default parameters of LGBMClassifier()

With hyperparameter tuning

5. Explainable AI

5.1 Permutation Feature Importance
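Permutation importance measures how much a model's score drops when one feature's values are shuffled. A minimal sketch with `sklearn.inspection.permutation_importance` follows; the random forest and the synthetic data, in which only feature 0 carries signal, are assumptions chosen so the expected ranking is obvious.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data: the label depends on feature 0 only.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
y = (X[:, 0] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# Shuffle each feature several times on held-out data and record the score drop.
result = permutation_importance(
    model, X_test, y_test, n_repeats=5, random_state=0
)

# Feature indices ordered from most to least important.
ranking = np.argsort(result.importances_mean)[::-1]
print("Mean importances:", result.importances_mean)
print("Ranking:", ranking)
```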

5.2 SHAP plots

Global explanation

Local explanation

5.3 Partial Dependence Plots (global explanation)

5.4 LIME plots (local explanation)